Genotyping and Molecular Characterization of VP6 and NSP4 Genes of Unusual Rotavirus Group A Isolated from Children with Acute Gastroenteritis

Group A rotavirus (RVA), which causes acute gastroenteritis (AGE) in children worldwide, is categorized mainly based on VP7 (genotype G) and VP4 (genotype P) genes. Genotypes that circulate at <1% are considered unusual. Important genes also include VP6 (genotype I) and NSP4 (genotype E). VP6 establishes the group and affects immunogenicity, while NSP4, as an enterotoxin, is responsible for the clinical symptoms. The aim of this study was to genotype the VP6 and NSP4 genes and molecularly characterize the NSP4 and VP6 genes of unusual RVA. Unusual RVA strains extracted from fecal samples of children ≤16 years with AGE were genotyped in VP6 and NSP4 genes with Sanger sequencing. In a 15-year period (2007–2021), 54.8% (34/62) of unusual RVA were successfully I and E genotyped. Three different I and E genotypes were identified; I2 (73.5%, 25/34) and E2 (35.3%, 12/34) were the most common. E3 genotype was detected from 2017 onwards. The uncommon combination of I2-E3 was found in 26.5% (9/34) of the strains and G3-P[9]-I2-E3 remained the most frequent G-P-I-E combination (20.6%, 7/34). Children infected with RVA E2 strains had a statistically higher frequency of dehydration (50%) than those infected with RVA E3 strains (p = 0.019). Multiple substitutions were detected in NSP4, but their functional effect remains unknown. The result indicates the genetic diversity of RVA strains. Continuous surveillance of the RVA based on the whole genome will provide better knowledge of its evolution.


Introduction
Group A rotavirus (RVA) is one of the most common etiological agents of severe acute gastroenteritis (AGE) in infants and young children, especially in developing countries. Children with RVA AGE can present with severe dehydration that can lead to death if left untreated. RVA is responsible for more than 100,000 deaths each year worldwide [1].
The viral particles consist of a triple-layered capsid. The outer capsid consists of the glycoprotein VP7 and the protease-sensitive spike protein VP4. The middle layer consists of VP6, and the core layer comprises VP2, which encapsulates the genomic RNA and viral replication components [2]. The abundant VP6 protein is commonly used for the detection and classification of rotaviruses. Currently, nine rotavirus species (A-D and F-J) have been recognized by the International Committee on Taxonomy of Viruses (ICTV); however, two additional putative rotavirus species, K and L, have also been proposed [3]. Only species A, B, C, and H can infect humans, including through animal-to-human transmission [4].
In addition to the full-length protein, the NSP4 gene encodes a toxic peptide (amino acids 114-135), and both act as enterotoxins that can stimulate Ca²⁺ release from the endoplasmic reticulum into the cytoplasm [8,9]. NSP4 enterotoxin activity aggravates the symptoms of gastroenteritis, particularly diarrhoea and vomiting [10,11].
Since 2006, several RVA vaccines have been released worldwide. The most widely used are the two-dose monovalent vaccine Rotarix (GlaxoSmithKline Biologicals, Belgium) and the three-dose pentavalent vaccine RotaTeq (Merck, United States), which cover the most common G and P genotypes [12,13]. After their release, notable changes in genotype distribution have been described worldwide, such as the increase in unusual G and P genotypes [2,14]. The aim of this study was the genotyping and molecular characterization of the VP6 and NSP4 genes of previously described [15] unusual G and P RVA strains isolated from children ≤16 years with AGE, in order to assess the virulence of these strains and their possible coverage by the rotavirus vaccines in use.

Study Design.
This is a retrospective study involving RVA-positive fecal samples with previously described unusual G (G6, G8, and G10) and/or P (P[6], P[9], P[10], P[11], and P[14]) genotypes collected from children ≤16 years hospitalized with AGE [15]. Children were admitted to "Aghia Sophia" Children's Hospital with AGE symptoms during 2007-2021. Fecal samples were collected within the first two days after admission, and the RVA antigen detection test (Rota-Adeno Combo Rapid kit, Hangzhou Alltest Biotech Co., China) was performed within the first eight hours after sample collection. RVA-positive fecal samples were stored at 2-8 °C and were genotyped within five days [15].
The G and P genotyping of the isolated strains was performed as previously described [15] using agarose gel electrophoresis and Sanger sequencing, as part of the annual surveillance of circulating RVA, in the Infectious Disease and Chemotherapy Research Laboratory of the National and Kapodistrian University of Athens, Greece. Unusual RVA genotypes were defined as the G and P genotypes that circulate at a very low frequency (≤1%) in the population and/or have an animal origin. These genotypes are not included in the vaccines, and hence their coverage remains unknown; their virulence also needs to be examined. In the present study, these strains were further genotyped in the VP6 (I genotype) and NSP4 (E genotype) genes.
The Scientific and Bioethics Committee of "Aghia Sophia" Children's Hospital approved this study (no. 6261).

VP7 and VP4 Genotyping.
Nucleic acid extraction and reverse transcription (RT) were performed as previously described [15] within the first five days after sample collection. In brief, RNA was extracted from RVA-positive samples using the MagNA Pure Compact Nucleic Acid Isolation Kit I (Roche Diagnostics, Basel, Switzerland) on a MagNA Pure Compact instrument following the manufacturer's instructions. The extracted RNA was reverse transcribed using the Transcriptor First Strand cDNA Synthesis Kit (Roche Diagnostics, Basel, Switzerland). VP7 and VP4 genes were amplified using specific primers according to the European Rotavirus Network Detection and Typing Methods (European Rotavirus Network 2009). Sanger sequencing was performed using the BigDye Terminator v3.1 Cycle Sequencing Kit on an Applied Biosystems 3500 genetic analyzer (Applied Biosystems, Waltham, MA, USA), and the BLAST bioinformatics tool (https://blast.ncbi.nlm.nih.gov/Blast.cgi) was used for G and P genotyping.

Amplification of VP6 and NSP4 Genes.
PCR amplification was performed using the GoTaq DNA polymerase (Promega, Madison, WI, USA) and the primers F: 5′-GAC GGV GCR ACT ACA TGG T-3′ plus R: 5′-GTC CAA TTC ATN CCT GGT G-3′ for the VP6 gene and F: 5′-GGC TTT TAA AAG TTC TGT TCC GAG-3′ plus R: 5′-GTC ACA YTA AGA CCR TTC CTT CCA T-3′ for the NSP4 gene [17,18]. The PCR amplification was carried out with an initial denaturation at 94 °C for 2 minutes (min), followed by 40 cycles of denaturation for 1 min at 94 °C, annealing for 1 min at 55 °C for the VP6 gene and 48 °C for the NSP4 gene, and extension for 1 min at 72 °C, with a final extension for 10 min at 72 °C. The amplification products were analyzed by 2% agarose gel electrophoresis using a 50 bp DNA ladder (N3236S; New England Biolabs, Massachusetts, USA) and ethidium bromide staining. If no PCR product was visible on the electrophoresis gel, the genotype was characterized as unidentified (UD). The VP6 and NSP4 sequences were compared with reference strains from the Wa, DS-1, and AU-1 constellations, depending on the I and E genotypes, to detect substitutions. The reference strains K02086.1 (Wa), DQ870507.1 (DS-1), and DQ490538.1 (AU-1) were used for the molecular characterization of the VP6 sequence, and AF093199.1 (Wa), EF672582.1 (DS-1), and D89873.1 (AU-1) were used for the characterization of the NSP4 sequence.
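Several of the primers above contain IUPAC ambiguity codes (V, R, N, Y), so each written primer stands for a small pool of oligonucleotides. The following minimal Python sketch, which is illustrative and not part of the study's workflow, counts and enumerates that pool. The helper names `degeneracy` and `expand` and the dictionary keys are hypothetical; the primer strings are copied from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as written in the text (5' to 3'); key names are hypothetical labels.
PRIMERS = {
    "VP6_F": "GAC GGV GCR ACT ACA TGG T",
    "VP6_R": "GTC CAA TTC ATN CCT GGT G",
    "NSP4_F": "GGC TTT TAA AAG TTC TGT TCC GAG",
    "NSP4_R": "GTC ACA YTA AGA CCR TTC CTT CCA T",
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, degeneracy(primer))
```

Under these definitions the VP6 forward primer expands to six sequences (V has three options and R has two), while the NSP4 forward primer is non-degenerate.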
Phylogenetic evolutionary analysis was performed on the VP6 and NSP4 genes using the MEGA 11 software (Molecular Evolutionary Genetics Analysis; https://www.megasoftware.net). Multiple sequence alignment was performed using MUSCLE software (Multiple Sequence Comparison by Log-Expectation). The nucleotide substitution model was selected based on the BIC (Bayesian information criterion) scores using MEGA 11. The model used in this study was the Tamura 3-parameter (T92) model combined with a rate variation model that allows some sites to be evolutionarily invariant (+I). The evolutionary tree was constructed using the maximum likelihood method and bootstrap resampling with 1000 replicates.
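As a rough illustration of BIC-based model selection, the sketch below uses the textbook definition BIC = k ln(n) - 2 ln(L), where the model with the lowest score is preferred. It is not MEGA's implementation: the function names, the model labels, the log-likelihoods, the parameter counts, and the choice of alignment length as the sample size n are all placeholder assumptions, not values from this study.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Textbook Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates: dict[str, tuple[float, int]], n_sites: int) -> str:
    """Return the name of the candidate with the lowest BIC.

    `candidates` maps a model name to (log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_sites))


if __name__ == "__main__":
    # Placeholder numbers for illustration only; not results from the study.
    toy_candidates = {
        "model_A": (-1200.0, 3),   # simpler model, slightly worse fit
        "model_B": (-1198.0, 10),  # richer model, slightly better fit
    }
    print(best_model(toy_candidates, n_sites=500))
```

In this toy case the simpler model wins because its small loss in likelihood is outweighed by the penalty on the extra parameters, which is the trade-off the criterion is designed to express.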

Statistical Analysis.
Statistical analysis was carried out using SPSS software (IBM Statistical Package for Social Sciences for Windows, Version 25.0, IBM Corp., Armonk, NY). A p value of ≤0.05 was considered statistically significant. Differences among variables were assessed using Pearson's chi-square test (χ² test). Fisher's exact test was used to analyze two or more categories or when the criteria for the χ² test were not met.

Association of I and E Genotypes with Patient Characteristics.
Statistical analysis of demographic, clinical, and laboratory data from children depending on RVA I genotype showed no significant association (Supplementary Tables 2 and 3). The corresponding analysis on E genotypes showed that children infected with E2 RVA strains had a higher relative frequency of dehydration (6/12, 50%) than those with the E3 genotype (0/9, 0%) (p = 0.019). The Vesikari severity score was estimated for 30/34 patients and is shown in Supplementary Table 1. Symptoms of severe gastroenteritis were seen in 16/30 (53.3%) children.
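The E2 versus E3 dehydration comparison can be checked by hand from the reported counts (6/12 versus 0/9). The sketch below is a plain-Python two-sided Fisher's exact test, not the SPSS procedure used in the study. It assumes that Fisher's test was the one applied to this 2x2 table, which the text does not state explicitly; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Rows: E2, E3; columns: dehydrated, not dehydrated (counts from the text).
    p = fisher_exact_two_sided(6, 6, 0, 9)
    print(round(p, 3))  # 0.019
```

Under that assumption the result agrees with the reported p = 0.019.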
The sequenced fragment of the VP6 gene included part of antigenic region III (aa 208-274). Genetic analysis revealed three already known substitutions: the homozygous Y248F carried by all I1, Wa, and Rotarix strains, the homozygous V252I carried by all I1 strains, and the homozygous I253V carried by I3 (n = 1) and AU-1 strains.
Phylogenetic analysis of the VP6 gene in 34 unusual RVA strains revealed three distinct groups corresponding to I1, I2, and I3 genotypes with 100% reliability for the I1 and I3 groups and 84% reliability for the I2 group. Among the unusual RVA strains carrying the I2 genotype, two distinct clades (I2-A and I2-B) were identified with 95% and 79% reliability, respectively (Figure 2).
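Substitution labels such as Y248F follow the pattern reference residue, position, observed residue. A minimal sketch of how such labels can be derived from a pairwise-aligned reference and query protein fragment is given below. The function name is hypothetical and the peptide strings are invented toy sequences, not real VP6 sequences; only the numbering convention is taken from the text.

```python
def call_substitutions(reference: str, query: str, start: int = 1) -> list[str]:
    """List amino acid substitutions between two aligned protein fragments.

    `reference` and `query` must be the same length (already aligned);
    `start` is the protein position of the first aligned residue.
    Gap ('-') and unknown ('X') positions are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for offset, (ref_aa, qry_aa) in enumerate(zip(reference.upper(), query.upper())):
        if ref_aa in "-X" or qry_aa in "-X":
            continue
        if ref_aa != qry_aa:
            substitutions.append(f"{ref_aa}{start + offset}{qry_aa}")
    return substitutions


if __name__ == "__main__":
    # Toy fragments covering positions 247-254; made up for illustration.
    toy_reference = "AYNGLVIS"
    toy_query = "AFNGLIIS"
    print(call_substitutions(toy_reference, toy_query, start=247))
    # ['Y248F', 'V252I']
```

The same comparison applies to any of the reference strains listed in the methods, provided the sequences are aligned first and the start position matches the reference numbering.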

Molecular Characterization and Phylogenetic Analysis of NSP4.
Molecular characterization was performed on the whole NSP4 gene. Through this comparison, 13 homozygous missense substitutions were found in strains carrying the E1 genotype, 21 homozygous and two heterozygous missense substitutions in strains carrying the E2 genotype, and 16 homozygous and one heterozygous missense substitution in strains carrying the E3 genotype (Figure 3). Most of these substitutions (n = 23) were located in the VP4-binding region (aa 112-148).
Phylogenetic analysis of the NSP4 gene in 28 unusual RVA strains revealed three distinct groups corresponding to E1, E2, and E3 genotypes with 100% reliability. Among the unusual RVA strains carrying the E2 genotype, three distinct clades (E2-A, E2-B, and E2-C) were identified (Figure 4).

Discussion
Few studies worldwide have investigated the molecular characterization of VP6 and NSP4 genes in human RVA strains, as interest has mainly focused on G and P distribution. This 15-year study focuses on the genotyping and molecular characterization of VP6 and NSP4 genes of unusual G and P RVA strains isolated from children hospitalized with AGE.
The rare combination of G3-P[9]-I2-E3 that was detected in this study is possibly derived from a reassortment event, but further investigation should be performed. Reassortment is common among RVs and is a crucial mechanism for the evolution of the virus. Molecular characterization of multiple RVA genes is important, as it may contribute to detecting strains that do not fit into any of the major constellations (Wa, DS-1, and AU-1) and are probably products of reassortment events. Furthermore, this finding supports that VP6 and NSP4 can segregate independently, contradicting a study in 2003 that reported a genetic linkage between these two proteins in common, unusual, and reassortant human strains [22]. Similarly to our observation, many studies have reported such reassortment events at VP6 and NSP4, but at a lower rate. In an 11-year study (1996-2006) in Brazil, the unusual I1-E2 genotype combination was found in 1.2% of circulating strains [24]. The combinations I2-E1 and I1-E2 were detected in 15.4% of RVA strains in India during 1990-2000 [28] and in 6.5% in Iran during 2021-2022 [29].
Multiple amino acid substitutions were detected in both VP6 and NSP4 genes. While many of these variants occupy key positions in the proteins, their functional impact remains unknown. The VP6 protein is crucial as it is used in molecular and serological diagnostic tests for RVA due to its high conservation [30,31]. The genetic analysis in this study showed that I2 was more conserved than I1 and I3, since only 16% of the I2 strains carried substitutions. This finding is in concordance with a similar study in South Africa, in which only I1 and I2 genotypes were described and I2 was found to be more conserved than I1, as it carried only two substitutions [31].
VP6 also contains four major antigenic regions [32]. Nyaga et al. described many substitutions in I1 antigenic region III, three of which (Y248F, V252I, and I253V) were also present in our samples, but their functional effect is unknown [31]. Changes in the antigenic regions should be closely monitored, since they could potentially affect the efficiency of rotavirus detection methods and the future development of a VP6-based vaccine, as VP6 also induces the development of neutralizing antibodies, as do the capsid proteins VP7 and VP4 [33].
NSP4 is an essential protein for virus morphogenesis and pathogenesis. In the present study, nine possibly novel substitutions were found in the NSP4 gene. Most substitutions were detected in the VP4-binding domain, which also contains the toxic peptide and the interspecies variable domain (ISVD). According to other studies characterizing the nucleotide sequence of the NSP4 gene, the ISVD region shows great heterogeneity and the amino acids vary according to genotype [18,34-37]. Limited functional studies exist, and therefore the effects of these variants on the functionality and immunogenicity of the corresponding protein remain unknown.
Of interest are the substitutions at amino acid 131, in the region of the toxic peptide: the majority of the strains in this study carried H131, whereas E2 strains mainly carried Y131. Ball et al. conducted a functional study of this amino acid in infant mice and found that substitution at amino acid 131 has an effect on the enterotoxin properties of NSP4 [38]. Specifically, they reported that the Y131K substitution resulted in the absence of diarrhoea. Studies from Brazil covering 1990-2000 and 1987-2003 have reported that Y131 was detected only in E2 strains, while E1 strains had H131, and there were no data regarding E3 strains [39,40]. Srivastava et al. showed that patients infected with a strain carrying Y131 experienced more severe diarrhoea [34]. Even though the severity of symptoms was not evaluated in the present study, statistical analysis showed that children infected with an unusual strain carrying the E2 genotype had a higher chance of exhibiting dehydration, which may indicate more severe diarrhoea. This result may also be related to the fact that Y131 was detected more often in E2 strains.
Limitations of the present study included the moderate detection rates of both VP6 and NSP4 genes in RVA-positive fecal samples. However, similar detection rates have also been reported in other studies, possibly due to poor sample storage conditions or the presence of RNases resulting in fragmentation of the viral RNA genome, the presence of PCR inhibitors, or the inability of primers to hybridize [8,41]. Another limitation of our study was that the analysis was based only on four genes (VP7, VP4, VP6, and NSP4) and not on the complete genotype constellation, which would provide more information about the genetic evolution of the strains. Future prospective studies are necessary to confirm our findings.
This is the first study of VP6 and NSP4 epidemiology and molecular characterization of unusual RVA strains in Greece, in which the unusual I3 and E3 genotypes, the reassortant I2-E3 human strains, and many substitutions in significant domains of the NSP4 gene were detected. Furthermore, a significant clinical association between dehydration and the E2 genotype was described.
Continuous surveillance of the distribution of RVA genotypes based on the whole genome, molecular characterization, and their association with epidemiological and clinical data is important for better knowledge of the virus's evolution, for disease prognosis, and for upgrading RVA vaccines.

Conclusions
In this study, the genotype distribution of the VP6 and NSP4 genes in unusual rotavirus strains was described. The association between the RVA genotype and the severity of the symptoms needs to be further investigated. The application of next-generation sequencing to investigate genotypic combinations in the complete viral genome, in combination with phylogenetic analysis, will probably provide answers regarding the origin and evolutionary relationship of these strains.

Figure 1: I-E genotype annual distribution of 34 unusual rotavirus group A strains isolated from children aged ≤16 years hospitalized with acute gastroenteritis during 2007-2021. EUD, unidentified E genotype.

Figure 2: Phylogenetic tree of VP6 gene of unusual rotavirus group A strains (n = 34) circulating in Greece between 2007 and 2021. Reference strains are indicated by a colored circle. The tree was constructed using the maximum likelihood method and T92 + G + I model [19]. Bootstrap values (1000 replicates) above 70% are shown. The scale bar indicates the branch length for 0.20 substitutions per nucleotide position. Evolutionary analysis was conducted with MEGA 11 software [20].

Figure 4: Phylogenetic tree of NSP4 gene of unusual RVA strains circulating in Greece between 2007 and 2021. Reference strains are indicated by a colored circle. The tree was constructed using the maximum likelihood method and T92 + I model [19]. Bootstrap values (1000 replicates) above 70% are shown. The scale bar indicates the branch length for 0.20 substitutions per nucleotide position. Evolutionary analysis was conducted with MEGA 11 software [20].

Table 1: Unusual G-P-I-E genotypes of group A rotaviruses circulating in Greece in a 15-year period (2007-2021).

Table 2: Significant substitutions observed in the NSP4 gene.